Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints
Inverse reinforcement learning (IRL) enables an agent to learn complex behavior by observing demonstrations from a (near-)optimal policy. The typical assumption is that the learner's goal is to match the teacher's demonstrated behavior. In this paper, we consider the setting where the learner has its own preferences, which it takes into consideration in addition to the demonstrations. These preferences can, for example, capture behavioral biases, mismatched worldviews, or physical constraints. We study two teaching approaches: learner-agnostic teaching, where the teacher provides demonstrations from an optimal policy ignoring the learner's preferences, and learner-aware teaching, where the teacher accounts for the learner's preferences. We design learner-aware teaching algorithms and show that significant performance improvements can be achieved over learner-agnostic teaching.
Reviews: Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints
This paper formalizes the problem of inverse reinforcement learning in which the learner's goal is not only to imitate the teacher's demonstrations but also to satisfy her own preferences and constraints. It analyzes the suboptimality of learner-agnostic teaching, where the teacher gives demonstrations without considering the learner's preferences. It then proposes learner-aware teaching algorithms, in which the teacher selects demonstrations while accounting for the learner's preferences. It considers different types of learner models, with hard or soft preference constraints. It also develops learner-aware teaching methods for two cases: one where the teacher has full knowledge of the learner's constraints, and one where the teacher does not know them.
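To make the gap between the two teaching approaches concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm. Everything in it is an assumption made for illustration: the three candidate policies, their feature expectations and preference costs, the reward weights `W`, the `BUDGET` threshold, and the learner model, which simply picks the feasible policy whose feature expectations are closest to the demonstrated ones.

```python
import math

# Hypothetical toy setup: each candidate policy is summarized by its
# reward-feature expectations "mu" and a scalar preference cost "cost".
POLICIES = {
    "A": {"mu": (1.0, 1.0), "cost": 1.0},
    "B": {"mu": (0.8, 0.0), "cost": 0.0},
    "C": {"mu": (0.3, 0.9), "cost": 0.2},
}
W = (1.0, 0.0)  # teacher's linear reward weights (assumed)
BUDGET = 0.5    # learner's hard preference constraint: cost <= BUDGET (assumed)


def reward(name):
    """Teacher's linear reward of a policy: dot product of W and mu."""
    return sum(w * m for w, m in zip(W, POLICIES[name]["mu"]))


def learner_response(demo):
    """Assumed learner model: among policies that satisfy the hard
    constraint, pick the one closest in feature expectations to the demo."""
    feasible = [n for n, p in POLICIES.items() if p["cost"] <= BUDGET]
    target = POLICIES[demo]["mu"]
    return min(feasible, key=lambda n: math.dist(POLICIES[n]["mu"], target))


def agnostic_teacher():
    """Learner-agnostic: demonstrate the reward-optimal policy."""
    return max(POLICIES, key=reward)


def aware_teacher():
    """Learner-aware: demonstrate whatever leads the learner to the
    highest-reward policy, given the learner's constraint."""
    return max(POLICIES, key=lambda d: reward(learner_response(d)))


if __name__ == "__main__":
    for label, teacher in [("agnostic", agnostic_teacher), ("aware", aware_teacher)]:
        demo = teacher()
        learned = learner_response(demo)
        print(label, "demo:", demo, "learned:", learned, "reward:", reward(learned))
```

In this toy instance the learner-agnostic teacher demonstrates the infeasible policy A, and the learner falls back to C, the nearest feasible policy, which has low reward. The learner-aware teacher demonstrates B instead, which the learner can adopt directly.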
Reviews: Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints
The paper proposes a really interesting and novel variant of inverse RL with a nice formalization. The proposed algorithms are suitable. While the reviewers felt that the empirical results were weak (lack of scalability and linear reward function limitation), they thought that this was outweighed by the novelty of the problem and the significance of the contribution.
Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints
Tschiatschek, Sebastian, Ghosh, Ahana, Haug, Luis, Devidze, Rati, Singla, Adish